Abstract

The menu-dependent nature of regret-minimization creates subtleties when it is applied 
to dynamic decision problems. It is not clear whether forgone opportunities should be 
included in the menu. We explain commonly observed behavioral patterns as minimizing 
regret when forgone opportunities are present. If forgone opportunities are included, we can 
characterize when a form of dynamic consistency is guaranteed. 


1 Introduction 


Savage [1951] and Anscombe and Aumann [1963] showed that a decision maker maximizing
expected utility with respect to a probability measure over the possible states of the world is
characterized by a set of arguably desirable principles. However, as Allais [1953] and Ellsberg
[1961] point out using compelling examples, sometimes intuitive choices are incompatible with
maximizing expected utility. One reason for this incompatibility is that there is often ambiguity
in the problems we face; we often lack sufficient information to capture all uncertainty using a
single probability measure over the possible states.

To this end, there is a rich literature offering alternative means of making decisions (see,
e.g., [Al-Najjar and Weinstein 2009] for a survey). For example, we might choose to represent
uncertainty using a set of possible states of the world, but using no probabilistic information
at all to represent how likely each state is. With this type of representation, two well-studied
rules for decision-making are maximin utility and minimax regret. Maximin says that you
should choose the option that maximizes the worst-case payoff, while minimax regret says that
you should choose the option that minimizes the regret you’ll feel at the end, where, roughly
speaking, regret is the difference between the payoff you achieved and the payoff that you
could have achieved had you known what the true state of the world was. Both maximin and
minimax regret can be extended naturally to deal with other representations of uncertainty. For
example, with a set of probability measures over the possible states, minimax regret becomes
minimax expected regret (MER) [Hayashi 2011; Stoye 2011]. Other works that use a set of
probability measures include, for example, [Campos and Moral 1995; Couso, Moral, and Walley
1999; Gilboa and Schmeidler 1993; Levi 1985; Walley 1991].


In this paper, we consider a generalization of minimax expected regret called minimax
weighted expected regret (MWER) that we introduced in an earlier paper [Halpern and Leung
2012]. For MWER, uncertainty is represented by a set of weighted probability measures.
Intuitively, the weight represents how likely the probability measure is to be the true distribution
over the states, according to the decision maker (henceforth DM). The weights work much like
a “second-order” probability on the set of probability measures. Similar ideas can be dated
back to at least Gärdenfors and Sahlin [1982, 1983]; see also [Good 1980] for discussion and
further references. Walley [1997] suggested putting a possibility measure [Dubois and Prade 1998]
on probability measures; this was also essentially done by Cattaneo [2007], Chateauneuf
and Faro [2009], and de Cooman [2005]. All of these authors and others (e.g., Klibanoff et al.
[2005]; Maccheroni et al. [2006]; Nau [1992]) proposed approaches to decision making using
their representations of uncertainty.


Real-life problems are often dynamic, with many stages where actions can be taken;
information can be learned over time. Before applying regret minimization to dynamic decision
problems, there is a subtle issue that we must consider. In static decision problems, the regret 
for each act is computed with respect to a menu. That is, each act is judged against the other 
acts in the menu. Typically, we think of the menu as consisting of the feasible acts, that is, the 
ones that the DM can perform. The analogue in a dynamic setting would be the feasible plans, 
where a plan is just a sequence of actions leading to a final outcome. In a dynamic decision 
problem, as more actions are taken, some plans become forgone opportunities. These are plans 
that were initially available to the DM, but are no longer available due to earlier actions of the 
DM. Since regret intuitively captures comparison of a choice against its alternatives, it seems 
reasonable for the menu to include all the feasible plans at the point of decision-making. But 
should the menu include forgone opportunities? 


Consequentialists would argue that it is irrational to care about forgone opportunities
[Hammond 1976; Machina 1989]; we should simply focus on the opportunities that are still available
to us, and thus not include forgone opportunities in the menu. And, indeed, when regret has
been considered in dynamic settings thus far (e.g., by Hayashi [2011]), the menu has not
included forgone opportunities. However, introspection tells us that we sometimes do take forgone
opportunities into account when we feel regret. For example, when considering a new job, one 
might compare the available options to what might have been available if one had chosen a 
different career path years ago. As we show, including forgone opportunities in the menu can 
make a big difference in behavior. Consider procrastination: we tell ourselves that we will 
start studying for an exam (or start exercising, or quit smoking) tomorrow; and then tomorrow 
comes, and we again tell ourselves that we will do it, starting tomorrow. This behavior is hard 
to explain with standard decision-theoretic approaches, especially when we assume that no new 
information about the world is gained over time. However, we give an example where, if forgone 
opportunities are not included in the menu, then we get procrastination; if they are, then we 
do not get procrastination. 


This example can be generalized. Procrastination is an example of preference reversal: the
DM’s preference at time t for what she should do at time t + 1 reverses when she actually gets
to time t + 1. We prove in Section 3 that if the menu includes forgone opportunities and the
DM acquires no new information over time (as is the case in the procrastination problem), 
then a DM who uses regret to make her decisions will not suffer preference reversals. Thus, we 
arguably get more rational behavior when we include forgone opportunities in the menu. 


What happens if the DM does get information over time? It is well known that, in this
setting, expected utility maximizers are guaranteed to have no preference reversals. Epstein and
Le Breton [1993] have shown that, under minimal assumptions, to avoid preference reversals, the
DM must be an expected utility maximizer. On the other hand, Epstein and Schneider [2003]
show that a DM using maxmin expected utility (MMEU) never has preference reversals if her
beliefs satisfy a condition they call rectangularity. Hayashi [2011] shows that rectangularity
also prevents preference reversals for MER under certain assumptions. Unfortunately, the
rectangularity condition is often not satisfied in practice. Other conditions have been provided
that guarantee dynamic consistency for ambiguity-averse decision rules (see, e.g., [Al-Najjar
and Weinstein 2009] for an overview).

We consider the question of preference reversal in the context of regret. Hayashi [2011] has
observed that, in dynamic decision problems, both changes in menu over time and updates to
the DM’s beliefs can result in preference reversals. In Section 4, we show that keeping forgone
opportunities in the menu is necessary in order to prevent preference reversals. But, as we show
by example, it is not sufficient if the DM acquires new information over time. We then provide a
condition on the beliefs that is necessary and sufficient to guarantee that a DM making decisions
using MWER will not have preference reversals. However, because this necessary and sufficient
condition may not be easy to check, we also give a simpler sufficient condition, similar in
spirit to Epstein and Schneider’s [2003] rectangularity condition.
Since MER can be understood as a special case of MWER where all weights are either 1 or 0, 
our condition for dynamic consistency is also applicable to MER. 

The remainder of the paper is organized as follows. Section 2 discusses preliminaries. Section 3
introduces forgone opportunities. Section 4 gives conditions under which consistent planning is
not required. We conclude in Section 5. We defer most proofs to the appendix.


2 Preliminaries 

2.1 Static decision setting and regret 

Given a set $S$ of states and a set $X$ of outcomes, an act $f$ (over $S$ and $X$) is a function
mapping $S$ to $X$. We use $\mathcal{F}$ to denote the set of all acts. For simplicity in this
paper, we take $S$ to be finite. Associated with each outcome $x \in X$ is a utility: $u(x)$ is
the utility of outcome $x$. We call a tuple $(S, X, u)$ a (non-probabilistic) decision problem.
To define regret, we need to assume that we are also given a set $M \subseteq \mathcal{F}$ of
acts, called the menu. The reason for the menu is that, as is well known, regret can depend on
the menu. We assume that every menu $M$ has utilities bounded from above. That is, we assume
that for all menus $M$, $\sup_{g \in M} u(g(s))$ is finite. This ensures that the regret of each
act is well defined. For a menu $M$ and act $f \in M$, the regret of $f$ with respect to $M$ and
decision problem $(S, X, u)$ in state $s$ is

\[ \mathrm{reg}_M(f, s) = \left( \sup_{g \in M} u(g(s)) \right) - u(f(s)). \]
That is, the regret of $f$ in state $s$ (relative to menu $M$) is the difference between
$u(f(s))$ and the highest utility possible in state $s$ among all the acts in $M$. The regret of
$f$ with respect to $M$ and decision problem $(S, X, u)$, denoted $\mathrm{reg}_M^{(S,X,u)}(f)$,
is the worst-case regret over all states:

\[ \mathrm{reg}_M^{(S,X,u)}(f) = \max_{s \in S} \mathrm{reg}_M(f, s). \]

We typically omit the superscript $(S, X, u)$ in $\mathrm{reg}_M^{(S,X,u)}(f)$ if it is clear from
context. The minimax regret decision rule chooses an act that minimizes
$\max_{s \in S} \mathrm{reg}_M(f, s)$. In other words, the minimax regret choice function is

\[ C_M^{\mathrm{reg}}(M') = \operatorname{argmin}_{f \in M'} \max_{s \in S} \mathrm{reg}_M(f, s). \]

The choice function returns the set of all acts in $M'$ that minimize regret with respect to $M$.
Note that we allow the menu $M'$, the set of acts over which we are minimizing regret, to be
different from the menu $M$ of acts with respect to which regret is computed. For example, if
the DM considers forgone opportunities, they would be included in $M$, although not in $M'$.
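
The definitions above can be illustrated with a minimal Python sketch. It assumes a finite menu, so that the supremum becomes a maximum, and it stores each act directly as its utility in each state. The act names a, b, c and all numbers are hypothetical and are not taken from the paper. The example shows how the choice from the feasible set M' can change when an act outside M' is added to the reference menu M.

```python
from typing import Dict, List, Set

# Assumption: an act is stored as {state: utility of its outcome}, i.e. u(f(s)).
Act = Dict[str, float]


def regret(menu: Dict[str, Act], act: Act, state: str) -> float:
    """reg_M(f, s): best utility available in the menu at `state` minus u(f(s))."""
    return max(g[state] for g in menu.values()) - act[state]


def worst_case_regret(menu: Dict[str, Act], act: Act, states: List[str]) -> float:
    """reg_M(f): the maximum over states of reg_M(f, s)."""
    return max(regret(menu, act, s) for s in states)


def minimax_regret_choice(menu: Dict[str, Act], feasible: Dict[str, Act],
                          states: List[str]) -> Set[str]:
    """Names of the acts in `feasible` (M') whose regret w.r.t. `menu` (M) is minimal."""
    scores = {name: worst_case_regret(menu, act, states) for name, act in feasible.items()}
    best = min(scores.values())
    return {name for name, score in scores.items() if score == best}


# Hypothetical two-state example.
states = ["s1", "s2"]
a = {"s1": 4, "s2": 0}
b = {"s1": 0, "s2": 10}
c = {"s1": 0, "s2": 3}

feasible = {"a": a, "c": c}                       # M'
full_menu = {"a": a, "b": b, "c": c}              # M that also contains b

without_b = minimax_regret_choice(feasible, feasible, states)   # regret w.r.t. M = M'
with_b = minimax_regret_choice(full_menu, feasible, states)     # regret w.r.t. the larger M

print(without_b, with_b)
```

In this toy instance the feasible acts are the same in both calls; only the reference menu differs, which is exactly the distinction between M and M' drawn above.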

If there is a probability measure $\Pr$ over the $\sigma$-algebra $\Sigma$ on the set $S$ of
states, then we can consider the probabilistic decision problem $(S, \Sigma, X, u, \Pr)$. The
expected regret of $f$ with respect to $M$ is

\[ \mathrm{reg}_M^{\Pr}(f) = \sum_{s \in S} \Pr(s)\, \mathrm{reg}_M(f, s). \]

If there is a set $\mathcal{P}$ of probability measures over the $\sigma$-algebra $\Sigma$ on the
set $S$ of states, then we consider the $\mathcal{P}$-decision problem
$\mathcal{D} = (S, \Sigma, X, u, \mathcal{P})$. The maximum expected regret of $f \in M$ with
respect to $M$ and $\mathcal{D}$ is

\[ \mathrm{reg}_M^{\mathcal{P}}(f) = \sup_{\Pr \in \mathcal{P}} \left( \sum_{s \in S} \Pr(s)\, \mathrm{reg}_M(f, s) \right). \]

The minimax expected regret (MER) decision rule minimizes $\mathrm{reg}_M^{\mathcal{P}}(f)$.

In an earlier paper, we introduced another representation of uncertainty, weighted sets of
probability measures [Halpern and Leung 2012]. A weighted set of probability measures
generalizes a set of probability measures by associating each measure in the set with a weight,
intuitively corresponding to the reliability or significance of the measure in capturing the true
uncertainty of the world. Minimizing weighted expected regret with respect to a weighted set
of probability measures gives a variant of minimax regret, called Minimax Weighted Expected
Regret (MWER). A set $\mathcal{P}^+$ of weighted probability measures on $(S, \Sigma)$ consists of
pairs $(\Pr, \alpha_{\Pr})$, where $\alpha_{\Pr} \in [0,1]$ and $\Pr$ is a probability measure on
$(S, \Sigma)$. Let $\mathcal{P} = \{\Pr : \exists \alpha\, ((\Pr, \alpha) \in \mathcal{P}^+)\}$.
We assume that, for each $\Pr \in \mathcal{P}$, there is exactly one $\alpha$ such that
$(\Pr, \alpha) \in \mathcal{P}^+$. We denote this number by $\alpha_{\Pr}$, and view it as the
weight of $\Pr$. We further assume for convenience that weights have been normalized so that
there is at least one measure $\Pr \in \mathcal{P}$ such that $\alpha_{\Pr} = 1$.

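If beliefs are modeled by a set $\mathcal{P}^+$ of weighted probabilities, then we consider the
$\mathcal{P}^+$-decision problem $\mathcal{D}^+ = (S, X, u, \mathcal{P}^+)$. The maximum weighted
expected regret of $f \in M$ with respect to $M$ and $\mathcal{D}^+ = (S, X, u, \mathcal{P}^+)$ is

\[ \mathrm{reg}_M^{\mathcal{P}^+}(f) = \sup_{(\Pr, \alpha) \in \mathcal{P}^+} \left( \alpha \sum_{s \in S} \Pr(s)\, \mathrm{reg}_M(f, s) \right). \]

If $\mathcal{P}^+$ is empty, then $\mathrm{reg}_M^{\mathcal{P}^+}$ is identically zero. Of course,
we can define the choice functions $C_M^{\mathrm{reg},\Pr}$, $C_M^{\mathrm{reg},\mathcal{P}}$, and
$C_M^{\mathrm{reg},\mathcal{P}^+}$ using $\mathrm{reg}_M^{\Pr}$, $\mathrm{reg}_M^{\mathcal{P}}$, and
$\mathrm{reg}_M^{\mathcal{P}^+}$, by analogy with $C_M^{\mathrm{reg}}$.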
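
As a rough illustration of the weighted definition, the following Python sketch computes the maximum weighted expected regret for a finite weighted set, where the supremum is a maximum. The acts, measures, and weights are hypothetical values chosen only for illustration, and acts are stored as their utilities per state, which is a simplifying assumption. Setting every weight to 1 gives the maximum expected regret used by MER.

```python
from typing import Dict, List, Tuple

Act = Dict[str, float]       # assumption: {state: utility of the act's outcome}
Measure = Dict[str, float]   # {state: probability}


def regret(menu: List[Act], act: Act, state: str) -> float:
    """reg_M(f, s) for a finite menu."""
    return max(g[state] for g in menu) - act[state]


def expected_regret(menu: List[Act], act: Act, pr: Measure) -> float:
    """Sum over states of Pr(s) * reg_M(f, s)."""
    return sum(p * regret(menu, act, s) for s, p in pr.items())


def max_weighted_expected_regret(menu: List[Act], act: Act,
                                 weighted: List[Tuple[Measure, float]]) -> float:
    """Max over (Pr, alpha) of alpha * expected regret; zero if the set is empty."""
    return max((alpha * expected_regret(menu, act, pr) for pr, alpha in weighted),
               default=0.0)


# Hypothetical acts and weighted measures.
a = {"s1": 4.0, "s2": 0.0}
c = {"s1": 0.0, "s2": 3.0}
menu = [a, c]
weighted = [({"s1": 1.0, "s2": 0.0}, 1.0),
            ({"s1": 0.5, "s2": 0.5}, 0.5)]

mwer_a = max_weighted_expected_regret(menu, a, weighted)
mwer_c = max_weighted_expected_regret(menu, c, weighted)
# Same measures with all weights equal to 1: the maximum expected regret of MER.
mer_a = max_weighted_expected_regret(menu, a, [(pr, 1.0) for pr, _ in weighted])

print(mwer_a, mwer_c, mer_a)
```

Here the second measure is the only one under which act a has any regret, so halving its weight halves the regret attributed to a.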

2.2 Dynamic decision problems 


A dynamic decision problem is a single-player extensive-form game where there is some set $S$
of states, nature chooses $s \in S$ at the first step, and does not make any more moves. The DM
then performs a finite sequence of actions until some outcome is reached. Utility is assigned to
these outcomes. A history is a sequence recording the actions taken by nature and the DM. At
every history $h$, the DM considers possible some other histories. The DM’s information set at
$h$, denoted $I(h)$, is the set of histories that the DM considers possible at $h$. Let $s(h)$
denote the initial state of $h$ (i.e., nature’s first move); let $R(h)$ denote all the moves the
DM made in $h$ after nature’s first move; finally, let $E(h)$ denote the set of states that the
DM considers possible at $h$; that is, $E(h) = \{s(h') : h' \in I(h)\}$. We assume that the DM
has perfect recall: this means that $R(h') = R(h)$ for all $h' \in I(h)$, and that if $h'$ is a
prefix of $h$, then $E(h') \supseteq E(h)$.

A plan is a (pure) strategy: a mapping from histories to histories that result from taking
the action specified by the plan. We require that a plan specify the same action for all histories
in an information set; that is, if $f$ is a plan, then for all histories $h$ and $h' \in I(h)$,
the last action in $f(h)$ and $f(h')$ must be the same (so that $R(f(h)) = R(f(h'))$). Given
an initial state s, a plan determines a complete path to an outcome. Hence, we can also view 
plans as acts: functions mapping states to outcomes. We take the acts in a dynamic decision 
problem to be the set of possible plans, and evaluate them using the decision rules discussed 
above. 


A major difference between our model and that used by Epstein and Schneider [2003] and
Hayashi [2009] is that the latter assume a filtration information structure. With a filtration
information structure, the DM’s knowledge is represented by a fixed, finite sequence of partitions.
More specifically, at time t, the DM uses a partition F(t) of the state space, and if the true 
state is s, then all that the DM knows is that the true state is in the cell of F(t) containing s. 
Since the sequence of partitions is fixed, the DM’s knowledge is independent of the choices that 
she makes, and her options and preferences cannot depend on past choices. This assumption 
significantly restricts the types of problems that can be naturally modeled. For example, if 
the DM prefers to have one apple over two oranges at time t, then this must be her time t 
preference, regardless of whether she has already consumed five apples at time t - 1. Moreover,
consuming an apple at time t cannot preclude consuming an apple at time t + 1. Since we 
effectively represent a decision problem as a single-player extensive-form game, we can capture 
all of these situations in a straightforward way. The models of Epstein, Schneider, and Hayashi 
can be viewed as a special case of our model. 


In a dynamic decision problem, as we shall see, two different menus are relevant for making
a decision using regret-minimization: the menu with respect to which regrets are computed,
and the menu of feasible choices. We formalize this dependence by considering choice functions
of the form $C_{M,E}$, where $E, M \neq \emptyset$. $C_{M,E}$ is a function mapping a nonempty
menu $M'$ to a nonempty subset of $M'$. Intuitively, $C_{M,E}(M')$ consists of the DM’s most
preferred choices from the menu $M'$ when she considers the states in $E$ possible and her
decisions are made relative to menu $M$. (So, for example, if the DM is making her choices using
regret minimization, the regret is taken with respect to $M$.) Note that there may be more than
one plan in $C_{M,E}(M')$; intuitively, this means that the DM does not view any of the plans in
$C_{M,E}(M')$ as strictly worse than some other plan.

What should M and E be when the DM makes a decision at a history h? We always take
E = E(h). Intuitively, this says that all that matters about a history as far as making a decision 
is the set of states that the DM considers possible; the previous moves made to get to that 
history are irrelevant. As we shall see, this seems reasonable in many examples. Moreover, it 
is consistent with our choice of taking probability distributions only on the state space. 

The choice of M is somewhat more subtle. The most obvious choice (and the one that has 
typically been made in the literature, without comment) is that M consists of the plans that 
are still feasible at $h$, where a plan $f$ is feasible at a history $h$ if, for all strict
prefixes $h'$ of $h$, $f(h')$ is also a prefix of $h$. So $f$ is feasible at $h$ if $h$ is
compatible with all of $f$’s moves. Let $M_h$ be the set of plans feasible at $h$. While taking
$M = M_h$ is certainly a reasonable choice, as we
shall see, there are other reasonable alternatives. 


Before addressing the choice of menu in more detail, we consider how to apply regret in
a dynamic setting. If we want to apply MER or MWER, we must update the probability
distributions. Epstein and Schneider [2003] and Hayashi [2009] consider prior-by-prior updating,
the most common way to update a set of probability measures, defined as follows:

\[ \mathcal{P} |^p E = \{\Pr | E : \Pr \in \mathcal{P}, \Pr(E) > 0\}. \]

We can also apply prior-by-prior updating to a weighted set of probabilities:

\[ \mathcal{P}^+ |^p E = \{(\Pr | E, \alpha) : (\Pr, \alpha) \in \mathcal{P}^+, \Pr(E) > 0\}. \]


Prior-by-prior updating can produce some rather counter-intuitive outcomes. For example, 
suppose we have a coin of unknown bias in [0.25, 0.75], and flip it 100 times. We can represent 
our prior beliefs using a set of probability measures. However, if we use prior-by-prior updating, 
then after each flip of the coin the set P+ representing the DM’s beliefs does not change, because 
the beliefs are independent. Thus, in this example, prior-by-prior updating is not capturing the 
information provided by the flips. 


We consider another way of updating weighted sets of probabilities, called likelihood updating
[Halpern and Leung 2012]. The intuition is that the weights are updated as if they were a
second-order probability distribution over the probability measures. Given an event
$E \subseteq S$, define $\mathcal{P}^+(E) = \sup\{\alpha \Pr(E) : (\Pr, \alpha) \in \mathcal{P}^+\}$;
if $\mathcal{P}^+(E) > 0$, let

\[ \alpha^l_E = \sup_{\{(\Pr', \alpha') \in \mathcal{P}^+ : \Pr' | E = \Pr | E\}} \frac{\alpha' \Pr'(E)}{\mathcal{P}^+(E)}. \]

Given a measure $\Pr \in \mathcal{P}$, there may be several distinct measures $\Pr'$ in
$\mathcal{P}$ such that $\Pr' | E = \Pr | E$. Thus, we take the weight of $\Pr | E$ to be the sup
of the possible candidate values of $\alpha^l_E$. By dividing by $\mathcal{P}^+(E)$, we guarantee
that $\alpha^l_E \in [0,1]$, and that there is some weighted measure $(\Pr, \alpha)$ such that
$\alpha^l_E = 1$, as long as there is some pair $(\Pr, \alpha) \in \mathcal{P}^+$ such that
$\alpha \Pr(E) = \mathcal{P}^+(E)$. If $\mathcal{P}^+(E) > 0$, we take $\mathcal{P}^+ |^l E$, the
result of applying likelihood updating by $E$ to $\mathcal{P}^+$, to be

\[ \{(\Pr | E, \alpha^l_E) : (\Pr, \alpha) \in \mathcal{P}^+, \Pr(E) > 0\}. \]


In computing $\mathcal{P}^+ |^l E$, we update not just the probability measures
$\Pr \in \mathcal{P}$, but also their weights, which are updated to $\alpha^l_E$. Although
prior-by-prior updating does not change the weights, for purposes of exposition, given a weighted
probability measure $(\Pr, \alpha)$, we use $\alpha^p_E$ to denote the “updated weight” of
$\Pr | E \in \mathcal{P}^+ |^p E$; of course, $\alpha^p_E = \alpha$.

Intuitively, probability measures that are supported by the new information will get larger
weights using likelihood updating than those not supported by the new information. Clearly, if
all measures in $\mathcal{P}$ start off with the same weight and assign the same probability to
the event $E$, then likelihood updating will give the same weight to each probability measure,
resulting in measure-by-measure updating. This is not surprising, since such an observation $E$
does not give us information about the relative likelihood of measures.

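Let $\mathrm{reg}_M^{\mathcal{P}^+ |^l E}(f)$ denote the regret of act $f$ computed with respect to
menu $M$ and beliefs $\mathcal{P}^+ |^l E$. If $\mathcal{P}^+ |^l E$ is empty (which will be the
case if $\mathcal{P}^+(E) = 0$) then $\mathrm{reg}_M^{\mathcal{P}^+ |^l E}(f) = 0$ for all acts
$f$. We can similarly define $\mathrm{reg}_M^{\mathcal{P}^+ |^p E}(f)$ for beliefs updated using
prior-by-prior updating. Also, let $C_M^{\mathrm{reg},\mathcal{P}^+ |^l E}(M')$ be the set of acts
in $M'$ that minimize the weighted expected regret $\mathrm{reg}_M^{\mathcal{P}^+ |^l E}$. If
$\mathcal{P}^+ |^l E$ is empty, then $C_M^{\mathrm{reg},\mathcal{P}^+ |^l E}(M') = M'$. We can
similarly define $C_M^{\mathrm{reg},\mathcal{P}^+ |^p E}$, $C_M^{\mathrm{reg},\mathcal{P} | E}$, and
$C_M^{\mathrm{reg},\Pr | E}$.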

3 Forgone opportunities 

As we have seen, when making a decision at a history h in a dynamic decision problem, the 
DM must decide what menu to use. In this section we focus on one choice. Take a forgone 
opportunity to be a plan that was initially available to the DM, but is no longer available due 
to earlier actions. As we observed in the introduction, while it may seem irrational to consider 
forgone opportunities, people often do. Moreover, when combined with regret, behavior that 
results by considering forgone opportunities may be arguably more rational than if forgone 
opportunities are not considered. Consider the following example. 

Example 3.1. Suppose that a student has an exam in two days. She can either start studying
today, play today and then study tomorrow, or just play on both days and never study. There
are two states of nature: one where the exam is difficult, and one where the exam is easy. The
utilities reflect a combination of the amount of pleasure that the student derives in the next
two days, and her score on the exam relative to her classmates. Suppose that the first day of
play gives the student $p_1 > 0$ utils, and the second day of play gives her $p_2 > 0$ utils. Her
exam score affects her utility only in the case where the exam is hard and she studies both
days, in which case she gets an additional $g_1$ utils for doing much better than everyone else,
and in the case where the exam is hard and she never studies, in which case she loses $g_2 > 0$
utils for doing much worse than everyone else. Figure 1 provides a graphical representation of
the decision problem. Since, in this example, the available actions for the DM are independent
of nature’s move, for compactness, we omit nature’s initial move (whether the exam is easy or
hard). Instead, we describe the payoffs of the DM as a pair $[a_1, a_2]$, where $a_1$ is the
payoff if the exam is hard, and $a_2$ is the payoff if the exam is easy.

Figure 1: An explanation for procrastination.

Assume that $2p_1 + p_2 > g_1 > p_1 + p_2$ and $2p_2 > g_2 > p_2$. That is, if the test were hard,
the student would be happier studying and doing well on the test than she would be if she played
for two days, but not too much happier; similarly, the penalty for doing badly in the exam if the
exam is hard and she does not study is greater than the utility of playing the second day, but
not too much greater. Suppose that the student uses minimax regret to make her decision. On
the first day, she observes that playing one day and then studying the next day has a worst-case
regret of $g_1 - p_1$, while studying on both days has a worst-case regret of $p_1 + p_2$.
Therefore, since $g_1 < 2p_1 + p_2$, she plays on the first day. On the next day, suppose that she
does not consider forgone opportunities and just compares her two available options, studying and
playing. Studying has a worst-case regret of $p_2$, while playing has a worst-case regret of
$g_2 - p_2$, so, since $g_2 < 2p_2$, she plays again on the second day. On the other hand, if the
student had included the forgone opportunity in the menu on the second day, then studying would
have regret $g_1 - p_1$, while playing would have regret $g_1 + g_2 - p_1 - p_2$. Since
$g_2 > p_2$, studying minimizes regret. □
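
The regret comparisons in Example 3.1 can be checked numerically. The Python sketch below is an illustration under assumptions: the payoff pairs are read off the verbal description of the example (Figure 1 itself is not reproduced here), and the values p1 = 2, p2 = 2, g1 = 5, g2 = 3 are hypothetical numbers chosen only because they satisfy the two inequalities assumed in the example.

```python
def worst_case_regret(menu, act):
    """Worst-case regret of `act` over its states, relative to the acts in `menu`."""
    return max(max(g[s] for g in menu.values()) - act[s] for s in act)


def choose(menu, feasible):
    """Names of the feasible plans that minimize worst-case regret w.r.t. `menu`."""
    scores = {name: worst_case_regret(menu, act) for name, act in feasible.items()}
    best = min(scores.values())
    return sorted(name for name, score in scores.items() if score == best)


# Hypothetical values with 2*p1 + p2 > g1 > p1 + p2 and 2*p2 > g2 > p2.
p1, p2, g1, g2 = 2, 2, 5, 3

# Payoffs [hard, easy] as read from the description of the example.
plans = {
    "study-study": {"hard": g1, "easy": 0},
    "play-study": {"hard": p1, "easy": p1},
    "play-play": {"hard": p1 + p2 - g2, "easy": p1 + p2},
}

# Day 1: all three plans are feasible and form the menu.
day1 = choose(plans, plans)

# Day 2 (after playing on day 1): only the two plans that start with play remain feasible.
still_feasible = {name: plans[name] for name in ("play-study", "play-play")}
day2_without_forgone = choose(still_feasible, still_feasible)  # menu = feasible plans
day2_with_forgone = choose(plans, still_feasible)              # menu keeps study-study

print(day1, day2_without_forgone, day2_with_forgone)
```

With these numbers the plan chosen on the first day is abandoned on the second day when the forgone plan is dropped from the menu, and is kept when the forgone plan stays in the menu.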


Example 3.1 emphasizes the roles of the menus $M$ and $M'$ in $C_{M,E}(M')$. Here we took
$M$, the menu relative to which choices were evaluated, to consist of all plans, even the ones
that were no longer feasible, while $M'$ consisted of only feasible plans. In general, to determine
the menu component $M$ of the choice function $C_{M,E}$ used at a history $h$, we use a
menu-selection function $\mu$. The menu $\mu(h)$ is the menu relative to which choices are
computed at $h$. We sometimes write $C_{\mu,h}$ rather than $C_{\mu(h),E(h)}$.

We can now formalize the notion of no preference reversal. Roughly speaking, this says
that if a plan $f$ is considered one of the best at history $h$ and is still feasible at an
extension $h'$ of $h$, then $f$ will still be considered one of the best plans at $h'$.


Definition 3.2 (No preference reversal). A family of choice functions $C_{\mu,h}$ has no
preference reversals if, for all histories $h$ and all histories $h'$ extending $h$, if
$f \in C_{\mu,h}(M_h)$ and $f \in M_{h'}$, then $f \in C_{\mu,h'}(M_{h'})$.


The fact that we do not get a preference reversal in Example 3.1 if we take forgone
opportunities into account is not just an artifact of this example. As we now show, as long
as we do not get new information and also use a constant menu (i.e., by keeping all forgone
opportunities in the menu), there will be no preference reversals if we minimize (weighted)
expected regret in a dynamic setting.

Proposition 3.3. If, for all histories $h$, $h'$, we have $E(h) = S$ and $\mu(h) = \mu(h')$, and
decisions are made according to MWER (i.e., the agent has a set $\mathcal{P}^+$ of weighted
probability distributions and a utility function $u$, and $f \in C_{\mu,h}(M_h)$ if $f$ minimizes
weighted expected regret with respect to $\mathcal{P}^+ |^l E(h)$ or $\mathcal{P}^+ |^p E(h)$),
then no preference reversals occur.

              Hard              Easy
              Short    Long     Short    Long
Pr_1          1        0        0        0
Pr_2          0        0.2      0.2      0.2
play-study    1        0        5        0
play-play     0        3        0        3

Table 1: $\alpha_{\Pr_1} = 1$, $\alpha_{\Pr_2} = 0.6$.


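Proof. Suppose that $f \in C_{\mu,\langle s \rangle}(M_{\langle s \rangle})$, $h$ is a history
extending $\langle s \rangle$, and $f \in M_h$. Since $E(h) = S$ and
$\mu(h) = \mu(\langle s \rangle)$ by assumption, we have
$C_{\mu(h),E(h)} = C_{\mu(\langle s \rangle),E(\langle s \rangle)}$. By assumption,
$f \in C_{\mu(\langle s \rangle),E(\langle s \rangle)}(M_{\langle s \rangle})$. It is easy to check
that MWER satisfies what is known in decision theory as Sen’s $\alpha$ axiom [Sen 1988]: if
$f \in M' \subseteq M''$ and $f \in C_{M,E}(M'')$, then $f \in C_{M,E}(M')$. That is, if $f$ is
among the most preferred acts in menu $M''$, and $f$ is in the smaller menu $M'$, then it must
also be among the most preferred acts in menu $M'$. Because
$f \in M_h \subseteq M_{\langle s \rangle}$ and
$f \in C_{\mu(\langle s \rangle),E(\langle s \rangle)}(M_{\langle s \rangle})$, we have
$f \in C_{\mu(h),E(h)}(M_h)$, as required. □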


Proposition 3.3 shows that we cannot have preference reversals if the DM does not learn
about the world. However, if the DM learns about the world, then we can have preference
reversals. Suppose, as is depicted in Table 1, that in addition to being hard or easy, the exam
can also be short or long. The student’s beliefs are described by the set of weighted probabilities
$\Pr_1$ and $\Pr_2$, with weights 1 and 0.6, respectively.


We take the option of studying on both days out of the picture by assuming that its utility 
is low enough for it to never be preferred, and for it to never affect the regret computations. 
After the first day, the student learns whether the exam will be hard or easy. One can verify 
that the ex ante regret of playing then studying is lower than that of playing on both days, 
while after the first day, the student prefers to play on the second day, regardless of whether 
she learns that the exam is hard or easy. 


4 Characterizing no preference reversal 

We now consider conditions under which there is no preference reversal in a more general 
setting, where the DM can acquire new information. While including all forgone opportunities 
is no longer a sufficient condition to prevent preference reversals, it is necessary, as the following 
example shows: Consider the two similar decision problems depicted in Figure 2. Note that
at the node after first playing L, the utilities and available choices are identical in the two
problems. If we ignore forgone opportunities, the DM necessarily makes the same decision in
both cases if her beliefs are the same. However, in the tree to the left, the ex ante optimal plan
is LR, while in the tree to the right, the ex ante optimal plan is LL. If the DM ignores forgone 
opportunities, then after the first step, she cannot tell whether she is in the decision tree on 
the left side, or the one on the right side. Therefore, if she follows the ex ante optimal plan in 
one of the trees, she necessarily is not following the ex ante optimal plan in the other tree. 

In light of this example, we now consider what happens if the DM learns information over
time. Our no preference reversal condition is implied by a well-studied notion called dynamic
consistency. One way of describing dynamic consistency is that a plan considered optimal at
a given point in the decision process is also optimal at any preceding point in the process, as
well as any future point that is reached with positive probability [Siniscalchi 2011]. For
menu-independent preferences, dynamic consistency is usually captured axiomatically by variations
of an axiom called Dynamic Consistency (DC) or the Sure Thing Principle [Savage 1954]. We
define a menu-dependent version of DC relative to events $E$ and $F$ using the following axiom.
The second part of the axiom implies that if $f$ is strictly preferred conditional on $E \cap F$
and at least weakly preferred on $E^c \cap F$, then $f$ is also strictly preferred on $F$. An
event $E$ is relevant to a dynamic decision problem $\mathcal{D}$ if it is one of the events that
the DM can potentially learn in $\mathcal{D}$, that is, if there exists a history $h$ such that
$E(h) = E$. A dynamic decision problem $\mathcal{D} = (S, \Sigma, X, u, \mathcal{P})$ is “proper”
if $\Sigma$ is generated by the subsets of $S$ relevant to $\mathcal{D}$. Given a decision problem
$\mathcal{D}$, we take the measurable sets to be the $\sigma$-algebra generated by the events
relevant to $\mathcal{D}$. The following axioms hold for all measurable sets $E$ and $F$, menus
$M$ and $M'$, and acts $f$ and $g$.


Axiom 1 (DC-M). If $f \in C_{M,E \cap F}(M') \cap C_{M,E^c \cap F}(M')$, then
$f \in C_{M,F}(M')$. If, furthermore, $g \notin C_{M,E \cap F}(M')$, then
$g \notin C_{M,F}(M')$.

Axiom 2 (Conditional Preference). If $f$ and $g$, when viewed as acts, give the same outcome
on all states in $E$, then $f \in C_{M,E}(M')$ iff $g \in C_{M,E}(M')$.


The next two axioms put some weak restrictions on choice functions.

Axiom 3. $C_{M,E}(M') \subseteq M'$ and $C_{M,E}(M') \neq \emptyset$ if $M' \neq \emptyset$.

Axiom 4 (Sen’s $\alpha$). If $f \in C_{M,E}(M')$, $M'' \subseteq M'$, and $f \in M''$, then
$f \in C_{M,E}(M'')$.

Theorem 4.1. For a dynamic decision problem $\mathcal{D}$, if Axioms 1-4 hold and
$\mu(h) = M$ for some fixed menu $M$, then there will be no preference reversals in $\mathcal{D}$.

We next provide a representation theorem that characterizes when Axioms 1-4 hold for
a MWER decision maker. The following condition says that the unconditional regret can be
computed by separately computing the regrets conditional on measurable events $E \cap F$ and on
$E^c \cap F$.

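Definition 4.2 (SEP). The weighted regret of $f$ with respect to $M$ and $\mathcal{P}^+$ is
separable with respect to $|^\chi$ ($\chi \in \{p, l\}$) if for all measurable sets $E$ and $F$
such that $\mathcal{P}^+(E \cap F) > 0$ and $\mathcal{P}^+(E^c \cap F) > 0$,

\[ \mathrm{reg}_M^{\mathcal{P}^+ |^\chi F}(f) = \sup_{(\Pr, \alpha) \in \mathcal{P}^+} \alpha \left( \Pr(E \cap F)\, \mathrm{reg}_M^{\mathcal{P}^+ |^\chi (E \cap F)}(f) + \Pr(E^c \cap F)\, \mathrm{reg}_M^{\mathcal{P}^+ |^\chi (E^c \cap F)}(f) \right), \]

and if $\mathrm{reg}_M^{\mathcal{P}^+ |^\chi (E \cap F)}(f) \neq 0$, then

\[ \mathrm{reg}_M^{\mathcal{P}^+ |^\chi F}(f) > \sup_{(\Pr, \alpha) \in \mathcal{P}^+} \alpha \Pr(E^c \cap F)\, \mathrm{reg}_M^{\mathcal{P}^+ |^\chi (E^c \cap F)}(f). \]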


We now show that Axioms 1–4 characterize SEP. Say that a decision problem $D$ is based on
$(S, \Sigma)$ if $D = (S, \Sigma, X, u, \mathcal{P})$ for some $X$, $u$, and $\mathcal{P}$. In the following results, we will also make
use of an alternative interpretation of weighted probability measures. Define a subprobability
measure $p$ on $(S, \Sigma)$ to be like a probability measure, in that it is a function mapping measurable
subsets of $S$ to $[0,1]$ such that $p(T \cup T') = p(T) + p(T')$ for disjoint sets $T$ and $T'$, except that
it may not satisfy the requirement that $p(S) = 1$. We can identify a weighted probability
distribution $(\Pr, \alpha)$ with the subprobability measure $\alpha \Pr$. (Note that given a subprobability
measure $p$, there is a unique pair $(\Pr, \alpha)$ such that $p = \alpha \Pr$: we simply take $\alpha = p(S)$ and
$\Pr = p/\alpha$.) Given a set $\mathcal{P}^+$ of weighted probability measures, we let $C(\mathcal{P}^+) = \{p \geq 0 :
\exists c, \exists \Pr, (\Pr, c) \in \mathcal{P}^+ \text{ and } p \leq c \Pr\}$.
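
The identification between weighted measures and subprobability measures can be illustrated on a finite state space. The following is only a sketch: the function names and the dictionary representation of a measure are our own illustrative assumptions, and exact rational arithmetic is used so that the round trip is exact.

```python
from fractions import Fraction


def to_subprobability(pr, alpha):
    """Map a weighted measure (Pr, alpha) to the subprobability measure alpha * Pr."""
    return {s: alpha * q for s, q in pr.items()}


def to_weighted(p):
    """Recover (Pr, alpha) from a subprobability measure p, assuming p(S) > 0."""
    alpha = sum(p.values())  # alpha = p(S)
    if alpha == 0:
        raise ValueError("the zero measure cannot be normalised")
    pr = {s: q / alpha for s, q in p.items()}  # Pr = p / alpha
    return pr, alpha


# Hypothetical two-state example: Pr = (1/4, 3/4) with weight 1/2.
pr = {"s1": Fraction(1, 4), "s2": Fraction(3, 4)}
alpha = Fraction(1, 2)
p = to_subprobability(pr, alpha)
assert to_weighted(p) == (pr, alpha)
```

The round trip is only well defined when $p(S) > 0$, which is why the sketch rejects the zero measure.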

Theorem 4.3. If $\mathcal{P}^+$ is a set of weighted distributions on $(S, \Sigma)$ such that $C(\mathcal{P}^+)$ is closed,
then the following are equivalent for $\chi \in \{p, l\}$:

(a) For all decision problems $D$ based on $(S, \Sigma)$ and all menus $M$ in $D$, Axioms 1–4 hold for
the family $C^{\mathrm{reg},\mathcal{P}^+|^\chi E}_{M}$ of choice functions.

(b) For all decision problems $D$ based on $(S, \Sigma)$, states $s \in S$, and acts $f \in M_{\langle s \rangle}$, the weighted
regret of $f$ with respect to $M_{\langle s \rangle}$ and $\mathcal{P}^+$ is separable with respect to $|^\chi$.

Note that Theorem 4.3 says that to check that Axioms 1–4 hold, we need to check only that
separability holds for initial menus $M_{\langle s \rangle}$.

It is not hard to show that SEP holds if the set $\mathcal{P}$ is a singleton. But, in general, it is not
obvious when a set of probability measures is separable. We thus provide a characterization
of separability, in the spirit of Epstein and Le Breton’s [1993] rectangularity condition. We
actually provide two conditions, one for the case of prior-by-prior updating, and another for
the case of likelihood updating. These definitions use the notion of maximum weighted expected
value of $\theta$, defined as $\overline{E}_{\mathcal{P}^+}(\theta) = \sup_{(\Pr,\alpha) \in \mathcal{P}^+} \alpha \sum_{s \in S} \Pr(s)\theta(s)$. We use $\overline{X}$ to denote the closure
of a set $X$.
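
For a finite set of weighted measures the supremum in the definition of the maximum weighted expected value is a maximum, so the quantity can be computed directly. The sketch below makes that finiteness assumption; the function name and the two example measures are hypothetical.

```python
def max_weighted_expectation(weighted_measures, theta):
    """Largest value of alpha * sum_s Pr(s) * theta(s) over a finite set of (Pr, alpha)."""
    return max(
        alpha * sum(pr[s] * theta[s] for s in pr)
        for pr, alpha in weighted_measures
    )


# Hypothetical example: two weighted measures on the states s1, s2.
P_plus = [
    ({"s1": 0.5, "s2": 0.5}, 1.0),  # contributes 1.0 * (0.5*2 + 0.5*4) = 3.0
    ({"s1": 0.0, "s2": 1.0}, 0.5),  # contributes 0.5 * (0*2 + 1*4) = 2.0
]
theta = {"s1": 2.0, "s2": 4.0}
print(max_weighted_expectation(P_plus, theta))  # 3.0
```

In the example, the measure that puts all its mass on the high-value state loses to the uniform measure because of its lower weight.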


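Definition 4.4 ($\chi$-Rectangularity). A set $\mathcal{P}^+$ of weighted probability measures is $\chi$-rectangular
($\chi \in \{p, l\}$) if for all measurable sets $E$ and $F$,

(a) if $(\Pr_1, \alpha_1), (\Pr_2, \alpha_2), (\Pr_3, \alpha_3) \in \mathcal{P}^+$, $\Pr_1(E \cap F) > 0$, and $\Pr_2(E^c \cap F) > 0$, then

$$\alpha_3 \Pr_3(E \cap F)\, \alpha^{\chi}_{1,E \cap F} \Pr_1|(E \cap F) + \alpha_3 \Pr_3(E^c \cap F)\, \alpha^{\chi}_{2,E^c \cap F} \Pr_2|(E^c \cap F) \in C(\mathcal{P}^+|^\chi F),$$

(b) for all $\delta > 0$, if $\overline{\mathcal{P}}^+(F) > 0$, then there exists $(\Pr, \alpha) \in \mathcal{P}^+|^\chi F$ such that
$\alpha(\delta \Pr(E \cap F) + \Pr(E^c \cap F)) > \sup_{(\Pr',\alpha') \in \mathcal{P}^+} \alpha' \Pr'(E^c \cap F)$, and

(c) for all nonnegative real vectors $\theta \in \mathbb{R}^{|S|}$,

$$\sup_{(\Pr,\alpha) \in \mathcal{P}^+|^\chi F} \alpha \left( \Pr(E \cap F)\, \overline{E}_{\mathcal{P}^+|^\chi (E \cap F)}(\theta) + \Pr(E^c \cap F)\, \overline{E}_{\mathcal{P}^+|^\chi (E^c \cap F)}(\theta) \right) \geq \overline{E}_{\mathcal{P}^+|^\chi F}(\theta).$$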


Recall that Epstein and Schneider proved that rectangularity is a condition that guarantees
no preference reversal in the case of MMEU [Epstein and Schneider 2003], and Hayashi
proved a similar result for MER [Hayashi 2009]. With MMEU and MER, only unweighted
probabilities are considered. Definition 4.4 essentially gives the generalization of Epstein and
Schneider’s condition to weighted probabilities. Part (a) of $\chi$-rectangularity is analogous to the
rectangularity condition of Epstein and Schneider. Part (b) of $\chi$-rectangularity corresponds to
the assumption that $(E \cap F)$ is non-null, which is analogous to Axiom 5 in Epstein and Schneider’s
axiomatization. Finally, part (c) of $\chi$-rectangularity holds for MMEU when weights are
in $\{0,1\}$, and thus is not necessary for Epstein and Schneider. It is not hard to show that
we can replace condition (a) above by the requirement that $\mathcal{P}^+$ is closed under conditioning,
in the sense that if $(\Pr, \alpha) \in \mathcal{P}^+$, then so are $(\Pr|(E \cap F), \alpha)$ and $(\Pr|(E^c \cap F), \alpha)$.

As the following result shows, $\chi$-rectangularity is indeed sufficient to give us Axioms 1–4
under prior-by-prior updating and likelihood updating.

Theorem 4.5. If $C(\mathcal{P}^+)$ is closed and convex, then Axiom 1 holds for the family of choices
$C^{\mathrm{reg},\mathcal{P}^+|^\chi E}_{M}$ if and only if $\mathcal{P}^+$ is $\chi$-rectangular.

The proof that $\chi$-rectangularity implies Axiom 1 requires only that $C(\mathcal{P}^+)$ be closed (i.e.,
convexity is not required). Hayashi [2011] proves an analogue of Theorem 4.5 for MER using
prior-by-prior updating. He also essentially assumes that the menu includes forgone opportunities,
but his interpretation of forgone opportunities is quite different from ours. He also shows
that if forgone opportunities are not included in the menu, then the set of probabilities representing
the DM’s uncertainty at all but the initial time must be a singleton. This implies that
the DM must behave like a Bayesian at all but the initial time, since MER acts like expected
utility maximization if the DM’s uncertainty is described by a single probability measure.

Epstein and Le Breton [1993] took this direction even further and proved that, if a few axioms
hold, then only Bayesian beliefs can be dynamically consistent. While Epstein and Le Breton’s
result was stated in a menu-free setting, if we use a constant menu throughout the decision
problem, then our model fits into their framework. At first glance, their impossibility result
may seem to contradict our sufficient conditions for no preference reversal. However, Epstein
and Le Breton’s impossibility result does not apply because one of their axioms, $P4^c$, does not
hold for MER (or MWER). For ease of exposition, we give $P4^c$ for static decision problems.
Given acts $f$ and $g$ and a set $T$ of states, let $fTg$ be the act that agrees with $f$ on $T$ and agrees
with $g$ on $T^c$. Given an outcome $x$, let $x^*$ be the constant act that gives outcome $x$ at all states.

Axiom 5 (Conditional weak comparative probability). For all events $T, A, B$, with $A \cup B \subseteq T$,
outcomes $w, x, y$, and $z$, and acts $g$, if $w^*Tg \succ x^*Tg$, $z^*Tg \succ y^*Tg$, and $(w^*Ax^*)Tg \succeq
(w^*Bx^*)Tg$, then $(z^*Ay^*)Tg \succeq (z^*By^*)Tg$.
$P4^c$ implies Savage’s P4, and does not hold for MER and MWER in general. For a simple
counterexample, let $S = \{s_1, s_2, s_3\}$, $X = \{o_1, o_5, o_7, o_{10}, o_{20}, o_{23}\}$, $A = \{s_1\}$, $B = \{s_2\}$, $T =
A \cup B$, $u(o_k) = k$, and let $g$ be the act such that $g(s_1) = o_{20}$, $g(s_2) = o_{23}$, and $g(s_3) = o_5$. Let
$\mathcal{P} = \{p_1, p_2, p_3\}$, where

• $p_1(s_1) = 0.25$ and $p_1(s_2) = 0.75$;

• $p_2(s_3) = 1$;

• $p_3(s_1) = 0.25$ and $p_3(s_3) = 0.75$.


Let the menu $M = \{o_1^*, o_7^*, o_{10}^*, o_{20}^*, g\}$. Let $\succeq$ be the preference relation determined by MER.
The regret of $o_{10}^*Tg$ is 15 (this is the regret with respect to $p_2$), and the regret of $o_7^*Tg$ is 15.25
(the regret with respect to $p_1$), therefore $o_{10}^*Tg \succ o_7^*Tg$. It is also easy to see that the regret of
$o_{20}^*Tg$ is 15 (the regret with respect to $p_2$), and the regret of $o_1^*Tg$ is 21.25 (the regret with respect
to $p_1$), so $o_{20}^*Tg \succ o_1^*Tg$. Moreover, the regret of $(o_{10}^*Ao_7^*)Tg$ is 15 (the regret with respect to $p_2$),
and the regret of $(o_{10}^*Bo_7^*)Tg$ is 15 (the regret with respect to $p_2$), so $(o_{10}^*Ao_7^*)Tg \succeq (o_{10}^*Bo_7^*)Tg$.
However, the regret of $(o_{20}^*Ao_1^*)Tg$ is 16.5 (the regret with respect to $p_1$), and the regret of
$(o_{20}^*Bo_1^*)Tg$ is 16 (the regret with respect to $p_3$), therefore $(o_{20}^*Ao_1^*)Tg \not\succeq (o_{20}^*Bo_1^*)Tg$.
Thus, Axiom 5 does not hold (taking $y = o_1$, $x = o_7$, $w = o_{10}$, $z = o_{20}$).
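
The regrets in this counterexample can be checked mechanically. The following sketch recomputes them under the stated data; the helper names (`const`, `splice`, `bet`, `mer_regret`) are our own, and regret is computed relative to the five-act menu $M$, with the spliced acts themselves not added to the menu.

```python
STATES = ("s1", "s2", "s3")
A, B = {"s1"}, {"s2"}
T = A | B
g = {"s1": 20, "s2": 23, "s3": 5}  # utilities of g, using u(o_k) = k
PRIORS = {
    "p1": {"s1": 0.25, "s2": 0.75, "s3": 0.0},
    "p2": {"s1": 0.0, "s2": 0.0, "s3": 1.0},
    "p3": {"s1": 0.25, "s2": 0.0, "s3": 0.75},
}


def const(k):
    """Constant act o_k^* (represented by its utility k in every state)."""
    return {s: k for s in STATES}


def splice(f, event, h):
    """The act that agrees with f on `event` and with h elsewhere."""
    return {s: (f[s] if s in event else h[s]) for s in STATES}


MENU = [const(1), const(7), const(10), const(20), g]
BEST = {s: max(act[s] for act in MENU) for s in STATES}  # best menu payoff per state


def mer_regret(act):
    """Maximum over the priors of the expected regret of `act` relative to MENU."""
    return max(
        sum(p[s] * (BEST[s] - act[s]) for s in STATES) for p in PRIORS.values()
    )


def bet(hi, lo, event):
    """(o_hi^* event o_lo^*) T g: hi on `event`, lo on the rest of T, g outside T."""
    return splice(splice(const(hi), event, const(lo)), T, g)


regrets = {
    "o10*Tg": mer_regret(bet(10, 10, A)),
    "o7*Tg": mer_regret(bet(7, 7, A)),
    "o20*Tg": mer_regret(bet(20, 20, A)),
    "o1*Tg": mer_regret(bet(1, 1, A)),
    "(o10*A o7*)Tg": mer_regret(bet(10, 7, A)),
    "(o10*B o7*)Tg": mer_regret(bet(10, 7, B)),
    "(o20*A o1*)Tg": mer_regret(bet(20, 1, A)),
    "(o20*B o1*)Tg": mer_regret(bet(20, 1, B)),
}
for name, value in regrets.items():
    print(name, value)
```

Lower regret means more preferred, so the last two values (16.5 against 16) are what break the conclusion of the axiom while its three hypotheses hold.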

Siniscalchi [2011, Proposition 1] proves that his notion of dynamically consistent conditional
preference systems must essentially have beliefs that are updated by Bayesian updating. However,
his result does not apply in our case either, because it assumes consequentialism: that
the conditional preference system treats identical subtrees equally, independent of the greater
decision tree within which the subtrees belong. This does not happen if, for example, we take
forgone opportunities into account.

There may be reasons to exclude forgone opportunities from the menu. Consequentialism,
according to Machina [1989], is ‘snipping’ the decision tree at the current choice node, throwing
the rest of the tree away, and calculating preferences at the current choice node by applying 
the original preference ordering to alternative possible continuations of the tree. With this 
interpretation, consequentialism implies that forgone opportunities should be removed from 
the menu. 


Similarly, there may be reasons to exclude unachievable plans from the menu. Preferences
computed with unachievable plans removed from the menu would be independent of these 
unachievable plans. This quality might make the preferences suitable for iterated elimination 
of suboptimal plans as a way of finding the optimal plan. In certain settings, it may be difficult 
to rank plans or find the most preferred plan among a large menu. For instance, consider the 
problem of deciding on a career path. In these settings, it may be relatively easy to identify bad 
plans, the elimination of which simplifies the problem. Conversely, computational benefits may 
motivate a decision maker to ignore unachievable plans. That is, a decision maker may choose 
to ignore unachievable plans because doing so simplifies the search for the preferred solution. 


5 Conclusion 

In dynamic decision problems, it is not clear which menu should be used to compute regret.
However, if we use MWER with likelihood updating, then in order to avoid preference reversals,
we need to include all initially feasible plans in the menu, and to impose richness conditions on the
beliefs. Another well-studied approach to circumventing preference reversals is sophistication. A
sophisticated agent is aware of the potential for preference reversals, and thus uses backward
induction to determine the achievable plans, which are the plans that can actually be carried
out. In the procrastination example, a sophisticated agent would know that she would not
study the second day. Therefore, she knows that playing on the first day and then studying on
the second day is an unachievable plan.


Siniscalchi [2011] considers a specific type of sophistication, called consistent planning, based
on earlier definitions of Strotz [1955] and Gul and Pesendorfer [2005]. Assuming a filtration
information structure, Siniscalchi axiomatizes behavior resulting from consistent planning using
any menu-independent decision rule.$^1$ With a menu-dependent decision rule, we need to consider
the choice of menu when using consistent planning. Hayashi [2009] axiomatizes sophistication
using regret-based choices, including MER and the smooth model of anticipated regret, under
the fixed filtration information setting. However, in his models of regret, Hayashi assumes that
the menu that the DM uses to compute regret includes only the achievable plans. In other
words, forgone opportunities and those plans that are not achievable are excluded from the
menu. It would be interesting to investigate the effect of including such plans in the menus of a
sophisticated DM. A sophisticated decision maker who takes unachievable plans into account
when computing regret can be understood as being “sophisticated enough” to understand that
her preferences may change in the future, but not sophisticated enough to completely ignore
the plans that she cannot force herself to commit to when computing regret. On the other
hand, a sophisticated decision maker who ignores unachievable plans does not feel regret for
not being able to commit to certain plans.


Finally, we have only considered “binary” menus in the sense that an act is either in the 
menu and affects regret computation, or it is not. A possible generalization is to give different 
weights to the acts in the menu, and multiply the regrets computed with respect to each act by 
the weight of the act. For example, with respect to forgone opportunities, “recently forgone” 
opportunities may warrant a higher weight than opportunities that have been forgone many 
timesteps ago. Such treatment of forgone opportunities will definitely affect the behavior of the 
DM. 
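
One way such act weights could enter the regret computation is sketched below. This is purely illustrative and is not a rule analyzed in this paper: the function name, the plan names, the utilities, and the choice to take the largest weighted utility gap (floored at zero) are all assumptions made for the example. With every weight equal to 1 the sketch reduces to ordinary regret with respect to the menu, and a weight of 0 removes an act from the menu.

```python
def weighted_menu_regret(utilities, weights, act, state):
    """Largest weighted utility gap between `act` and the menu acts in `state`.

    utilities[a][s] is the payoff of act a in state s; weights[a] is the
    (hypothetical) weight attached to menu act a.
    """
    own = utilities[act][state]
    gaps = [w * (utilities[other][state] - own) for other, w in weights.items()]
    return max([0.0] + gaps)


# Hypothetical payoffs in a single state "s".
utilities = {"study": {"s": 5}, "play": {"s": 8}, "forgone_trip": {"s": 12}}

# A long-forgone opportunity gets half the weight of the current options.
weights = {"study": 1.0, "play": 1.0, "forgone_trip": 0.5}
print(weighted_menu_regret(utilities, weights, "study", "s"))  # 3.5

# Binary menus are the special cases weight 1 (included) and weight 0 (excluded).
included = dict(weights, forgone_trip=1.0)
excluded = dict(weights, forgone_trip=0.0)
print(weighted_menu_regret(utilities, included, "study", "s"))  # 7.0
print(weighted_menu_regret(utilities, excluded, "study", "s"))  # 3.0
```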


A Proof of Theorem 4.1

We restate the theorem here (as we do for the other results proved in the appendix) for the
reader’s convenience.

Theorem 4.1. For a dynamic decision problem $D$, if Axioms 1–4 hold and $\mu(h) = M$ for some
fixed menu $M$, then there will be no preference reversals in $D$.


Proof. Before proving the result, we need some definitions. Say that an information set $I$ refines
an information set $I'$ if, for all $h \in I$, some prefix $h'$ of $h$ is in $I'$. Suppose that there is a history
$h$ such that $f, g \in M_h$ and $I(h) = I$. Let $fIg$ denote the plan that agrees with $f$ at all histories
$h'$ such that $I(h')$ refines $I$ and agrees with $g$ otherwise. As we now show, $fIg$ gives the same
outcome as $f$ on states in $E = E(h)$ and the same outcome as $g$ on states in $E^c$; moreover,
$fIg \in M_h$.

$^1$ Siniscalchi considers a more general information structure, where the information that the DM
receives can depend on her actions, in an unpublished version of his paper [Siniscalchi 2006].

Suppose that $s(h) = s$ and that $s' \in E$. Since $E(h) = E$, there exists a history $h' \in I(h)$
such that $s(h') = s'$ and $R(h') = R(h)$. Since $f, g \in M_h$, there must exist some $k$ such that
$f^k(\langle s \rangle) = g^k(\langle s \rangle) = h$ (where, as usual, $f^0(\langle s \rangle) = \langle s \rangle$ and for $k' \geq 1$, $f^{k'}(\langle s \rangle) = f(f^{k'-1}(\langle s \rangle))$).
We claim that for all $k' \leq k$, $f^{k'}(\langle s' \rangle) = g^{k'}(\langle s' \rangle)$, and $f^{k'}(\langle s' \rangle)$ is in the same information set
as $f^{k'}(\langle s \rangle)$. The proof is by induction on $k'$. If $k' = 0$, the result follows from the observation
that since $\langle s \rangle$ is a prefix of $h$, there must be some prefix of $h'$ in $I(\langle s \rangle)$. For the inductive step,
suppose that $k' \geq 1$. We must have $f^{k'}(\langle s \rangle) = g^{k'}(\langle s \rangle)$ (otherwise $g$ would not be in $M_h$). Since
$g^{k'-1}(\langle s \rangle) = f^{k'-1}(\langle s \rangle)$ and $f^{k'-1}(\langle s' \rangle) = g^{k'-1}(\langle s' \rangle)$ are in the same information set, by the
inductive hypothesis, $g$ must perform the same action at $g^{k'-1}(\langle s \rangle)$ and $g^{k'-1}(\langle s' \rangle)$, and $f$ must
perform the same action at $f^{k'-1}(\langle s \rangle)$ and $f^{k'-1}(\langle s' \rangle)$. Since $g^{k'}(\langle s \rangle)$ and $f^{k'}(\langle s \rangle)$ are both
prefixes of $h$, $g$ and $f$ perform the same action at $f^{k'-1}(\langle s \rangle) = g^{k'-1}(\langle s \rangle)$. It follows that $f$
and $g$ perform the same action at $f^{k'-1}(\langle s' \rangle) = g^{k'-1}(\langle s' \rangle)$, and so $f^{k'}(\langle s' \rangle) = g^{k'}(\langle s' \rangle)$. Thus,
$g^{k'}(\langle s' \rangle)$ must be a prefix of $h'$, and so must be in the same information set as $f^{k'}(\langle s \rangle)$. This
completes the inductive proof.

Since $f^k(\langle s' \rangle) = g^k(\langle s' \rangle) = h'$, it follows that $f^k(\langle s' \rangle) = (fIg)^k(\langle s' \rangle)$. Below $I$, all the information
sets are refinements of $I$, so by definition, for $k' > k$, we must have $f^{k'}(\langle s' \rangle) = (fIg)^{k'}(\langle s' \rangle)$.
Thus, $f$ and $fIg$ give the same outcome for $s'$, and hence for all states in $E$. Note that it follows that
$(fIg)^k(\langle s \rangle) = h$, so $fIg \in M_h$.

For $s' \notin E$ and all $k'$, it cannot be the case that $I((fIg)^{k'}(\langle s' \rangle))$ is a refinement of $I$, since
the first state in $(fIg)^{k'}(\langle s' \rangle)$ is $s'$, and no history in a refinement of $I$ has a first state of $s'$.
Thus, $(fIg)^{k'}(\langle s' \rangle) = g^{k'}(\langle s' \rangle)$ for all $k'$, so $g$ and $fIg$ give the same outcome for $s'$, and hence
for all states in $E^c$.

Returning to the proof of the theorem, suppose that $f \in C_{\mu,h}(M_h)$, $h'$ is a history
extending $h$, and $f \in M_{h'}$. We want to show that $f \in C_{\mu,h'}(M_{h'})$. By perfect recall, $E(h') \subseteq
E(h)$. Suppose, by way of contradiction, that $f \notin C_{\mu,h'}(M_{h'})$. Since $f \in C_{\mu,h}(M_{h'})$, we cannot
have $E(h') = E(h)$, so $E(h') \subset E(h)$. Choose $f' \in C_{\mu,E(h')}(M_{h'})$ and $g \in C_{\mu,E(h')^c \cap E(h)}(M_{h'})$
(note that $C_{\mu,E(h')}(M_{h'}) \neq \emptyset$ and $C_{\mu,E(h')^c \cap E(h)}(M_{h'}) \neq \emptyset$ by Axiom 3). Since $f', g \in M_{h'}$ (by
Axiom 3), $f'I(h')g$ is in $M_{h'}$. Since $f'I(h')g$ and $f'$, when viewed as acts, agree on states in
$E(h')$, we must have $f'I(h')g \in C_{\mu,E(h')}(M_{h'})$ by Axiom 2. Similarly, since $f'I(h')g$ and $g$, when
viewed as acts, agree on states in $E(h')^c \cap E(h)$, we must have $f'I(h')g \in C_{\mu,E(h')^c \cap E(h)}(M_{h'})$.
Therefore, by Axiom 1, $f'I(h')g \in C_{\mu,h}(M_{h'})$. Also by Axiom 1, since $f \notin C_{\mu,h'}(M_{h'})$, we must
have $f \notin C_{\mu,h}(M_{h'})$. By Axiom 4, this implies that $f \notin C_{\mu,h}(M_h)$ (since $M_{h'} \subseteq M_h$), giving us
the desired contradiction. □


B Proof of Theorem 4.3


Theorem 4.3. If $\mathcal{P}^+$ is a set of weighted distributions on $(S, \Sigma)$ such that $C(\mathcal{P}^+)$ is closed,
then the following are equivalent:

(a) For all decision problems $D$ based on $(S, \Sigma)$ and all menus $M$ in $D$, Axioms 1–4 hold for
choice functions represented by $\mathcal{P}^+|^l E$ (resp., $\mathcal{P}^+|^p E$).

(b) For all decision problems $D$ based on $(S, \Sigma)$, states $s \in S$, and acts $f \in M_{\langle s \rangle}$, the weighted
regret of $f$ with respect to $M_{\langle s \rangle}$ and $\mathcal{P}^+$ is separable.

We actually prove the following stronger result. 

Theorem B.1. If $\mathcal{P}^+$ is a set of weighted distributions on $(S, \Sigma)$ such that $C(\mathcal{P}^+)$ is closed,
then the following are equivalent:

(a) For all decision problems $D$ based on $(S, \Sigma)$, Axioms 1–4 hold for menus of the form $M_{\langle s \rangle}$
for choice functions represented by $\mathcal{P}^+|^l E$ (resp., $\mathcal{P}^+|^p E$).

(b) For all decision problems $D$ based on $(S, \Sigma)$ and all menus $M$ in $D$, Axioms 1–4 hold for
choice functions represented by $\mathcal{P}^+|^l E$ (resp., $\mathcal{P}^+|^p E$).

(c) For all decision problems $D$ based on $(S, \Sigma)$, states $s \in S$, and acts $f \in M_{\langle s \rangle}$, the weighted
regret of $f$ with respect to $M_{\langle s \rangle}$ and $\mathcal{P}^+$ is separable.

(d) For all decision problems $D$ based on $(S, \Sigma)$, menus $M$ in $D$, and acts $f \in M$, the weighted
regret of $f$ with respect to $M$ and $\mathcal{P}^+$ is separable.

Proof. Fix an arbitrary state space $S$, measurable events $E, F \subseteq S$, and a set $\mathcal{P}^+$ of weighted
distributions on $(S, \Sigma)$. The fact that (b) implies (a) and (d) implies (c) follows immediately.
Therefore, it remains to show that (a) implies (d) and that (c) implies (b).

Since the proof is identical for prior-by-prior updating ($|^p$) and for likelihood updating ($|^l$),
we use $|$ to denote the updating operator. That is, the proof can be read with $|$ denoting $|^p$, or
with $|$ denoting $|^l$.

To show that (a) implies (d), we first show that Axiom 1 implies that for all decision
problems $D$ based on $(S, \Sigma)$, menus $M$ in $D$, sets $\mathcal{P}^+$ of weighted probabilities, and acts $f \in M$,

$$\mathrm{reg}_M^{\mathcal{P}^+|F}(f) \geq \sup_{(\Pr,\alpha) \in \mathcal{P}^+} \alpha \left( \Pr(E \cap F)\, \mathrm{reg}_M^{\mathcal{P}^+|(E \cap F)}(f) + \Pr(E^c \cap F)\, \mathrm{reg}_M^{\mathcal{P}^+|(E^c \cap F)}(f) \right). \quad (1)$$

Suppose, by way of contradiction, that (1) does not hold. Then for some decision problem $D$
based on $(S, \Sigma)$, measurable events $E, F \subseteq S$, menu $M$ in $D$, and act $f \in M$, we have that

$$\mathrm{reg}_M^{\mathcal{P}^+|F}(f) < \sup_{(\Pr,\alpha) \in \mathcal{P}^+} \alpha \left( \Pr(E \cap F)\, \mathrm{reg}_M^{\mathcal{P}^+|(E \cap F)}(f) + \Pr(E^c \cap F)\, \mathrm{reg}_M^{\mathcal{P}^+|(E^c \cap F)}(f) \right).$$

We define a new decision problem D' based on (S, E). The idea is that in D' , we will have a 
plan ap such that ap E C((f,’ E rF (A / I") and ap E C^,'^, crF (M") and ap ^ Cap’f (M") for 
some M" C AT, where M’ is the menu at the initial decision node for the DM. 

We construct D' as follows. D' is a depth-two tree; that is, nature makes a single move, 
and then the DM makes a single move. At the first step, nature choose a state sEf. At the 
second step, the DM chooses from the set {a g : g E M} U {ap} of actions. With a slight abuse 
of notation, we let a g also denote the plan in T' that chooses the action a g at the initial history 
(s). Therefore, the initial menu in decision problem D' is M' = {a g : g E M} U {ap}. 
The utilities for the actions/plans in $D'$ are defined as follows. For actions $\{a_g : g \in M\}$,
the utility of $a_g$ in state $s$ is just the utility of the outcome resulting from applying plan $g$ in
state $s$ in decision problem $D$. The action $a_{f'}$ has utilities

( ( S sup <?eM u(g(s)) - reg V ^ {EnF \f) if s <E E n F 

«(<*/'(«)) H . V+\(E^F) fn . f cpcn7? 

lsup 9eM u(g(.s)) - reg M (/) if s € E c n F. 

For all states s £ F, we have that u(ay(s)) < sup ggM u(g(s)). As a result, for all states 
s E F, we have that 


sup u(g(s)) = sup u(a g (s)). 

g£M a g £M' 


Since the regret of a plan in state s depends only on its payoff in s and the best payoff in s, it 
is not hard to see that the regrets of a g with respect to M' is the same as the regret of g with 
respect to M. More precisely, for all g G M, 

™>£ l(EnF) K) = 

re 9M’^ E nF \ a 9 ) = re ^M +l(E rF \a), and 

re g V W F ( a g) = re 0M +|F (5)- 


By definition of ay, for each state s G E n F, we have reg M i(ay,s) = reg M ' (f ), 

and for each state s € E c n F, we have reg M ,(ay , s) = reg ^ ^ E rF \f)- Thus, for all 
Pr € V, if Pr(iF n F) / 0, then reg^^ EnF \f) = reg F { ^ EnF \f), and if Pr (E c (IF) / 0, 
then reg V ^ E nF \f ) = reg* M ^ E nF \f)- If for all (Pr, a) € V + \(E n F), aPr (E n F) = 0, 
then reg^j,^ EnF \ay) = reg'^ I ,^ EnE \af) = 0. Otherwise, since there is some measure in 
V + \(E n F ) that has weight 1, we must have reg^,^ EnF \ay) = reg F 4 ,^ EnF \af). Similarly, 
reg M > («/') = r eg M ' («/)• Thus, 


reg VF ) F (ay) = sup (PriQ)e7 ,+ a (Pr (E 0 F)reg^ +l(EnF) (f) + Pr(E c 0 F)reg 1 ^ l{E ° F) (/)) 

I 7T 

> reg M (/) [by assumption] 

= reg F I ,^ F (af) [by construction]. 

Therefore, we have ay € C r R e f, ’,EnF({ a f'’ a f})’ a F e C]/, ,E c nF^ a f'^ a /})’ an< ^ a r CM’\f ({ a /')°/})) 
violating Axiom [T] 

By an analogous argument, we show that the opposite weak inequality, 
re i +|F (/) < sup a (Pr (E n F)reg E p (EnF) (/) + Pr (E c 0 F)reg P A ^ {ECr ' F) (/)) , (2) 

(Pr ,a)£V+ V 7 


is also implied by Axiom [I] Suppose, by way of contradiction, that ([2]) does not hold. Then 
for some decision problem D based on (S', £), measurable events E,F C S, menu M in D , and 
act / € M, we have that 

reg^^tf) > sup a (Pr (E n F)reg V ^ EnF \f) + Pr (E c n F)reg V ^^ E nF) (/)) . 

(Pr,a)£'P+ V ’ 7 
We define a decision problem $D'$ based on $(S, \Sigma)$ just as in the previous case. Specifically, we

, ,, , P+\(EnF), s V+\{EnF), x , ,, , P+|(E c nF), x V+\{E^F), x 

have that reg M , \ a f) = re 9M' (a/), aR d that re 9M' \ a f) = re 9M' \ a f)- 

The one difference from the previous case is that we now have 


reg^, lF (a f ) = sup (Pria)eP + a (Pr(E n F)reg^ i(EnF) (/) + Pr (E c n F)reg 1 ^ ](E nF) (/)) 

T>~ > r I /? 

< reg M (/) [by assumption] 

= reg^,\ F (cif) [by construction]. 

Therefore, we have aj G Cy? \EnF^ a f'-’ a /})> a / e Cff, ,E c nF^ a f '' a /})’ and aj ^ C M 9 ,’ F ({ af,aj }), 
violating Axiom [lj 

To complete the proof that (a) implies (d), we show that Axiom [I] also implies that for all 
decision problems D based on (S', £), menus M in D, sets V + of weighted probabilities, and 
acts / G M, if re<?^ K Eni? )(/) > 0, then 


reg V M W (f) > sup a Pr(£ c n F)reg 1 ^ l{E nF \f). 
(Pr ,a)eP+ 


(3) 


Suppose, by way of contradiction, that ([3]) does not hold. Then for some decision problem D 
based on (S, £), events E, F C S, menu M in D, and act / 6 M such that reg F f l( EnF ">(f ) > o 
and 

reg^ ]F {f) < sup aFr(E c nF)reg 1 ^ ](E nF \f). 

(Pr,ct)e'P+ 

We now define a new decision problem D' based on (S, £). The idea is that in D' , we have a 
plan a,f such that a/ ^ C‘met\f(M') but af 6 C r ^ F (M') for some M' C M. 

Construct D' exactly as before. That is, in the first step, nature chooses a state sgS, and 
in the second step, the DM chooses from the set of actions/plans M' = {a g : g G M} U {cy}. 
For each g 6 M, define the actions a g as before. We define a new action ay with utilities 

, ( \\ _ f su P 9 eM«(ff('S)), if s € ECF 

u(a g '(s )) — < . .. p+|(E c nF)/ ,x . f c 

[sup geM 7 /(g(s)) - re# M ; (/), lfseFnP. 

It is almost immediate from the definition of ay that we have 

re 9 V w F ( a g') = SU P a fP r (E c C F)reg V M^ E nF \f )) > reg V ^} F (a/). 

(Pr,a)e'P+ V 7 


However, we also have 


re^ l(BnF) K) = 0 < re 5M^ (EnF) ( a /)- 


Therefore, we have aj C™ 9 ' EnF ({a g ', cy}) but af € C r FFF ({ay, ay}), violating Axiom 1 
We next show that (c) implies (b). Specifically, we show that SEP for the initial menus of
all decision problems $D$ is sufficient to guarantee that Axioms 1–4 hold for menu $M$ and all
choice sets $M' \subseteq M$. It is easy to check that Axioms 2–4 hold for MWER, so we need to check
only Axiom 1.

Consider an arbitrary decision problem $D$, menu $M$ in $D$, $M' \subseteq M$, and a plan $f$ in $M'$.
We construct a new decision problem $D'$ such that the initial menu of $D'$ is “equivalent” to $M$.
Just as before, let $D'$ be a two-stage decision problem where in the first stage, nature chooses
$s \in S$, and in the second stage, the DM chooses from the set $M_0 = \{a_g : g \in M\}$, where $a_g$ is
defined as before. Again, we associate each action $a_g$ with the plan that chooses $a_g$ in $D'$. $M_0$
is then “equivalent” to $M$ in the sense that

™C' iEnF) M = -A +l(EnF, (9). 

re 9Mo {E nF) ( a g ) = re 9M^ E nF) (9), and 
re 9 1 M 0 lF ( a g ) = reg^ lF (g). 

Suppose that / £ C^ 9 FnF (M') and / £ C™/ FcrF (M'). This means that for all g £ M' , we 

have reg M(j \a f ) < re£ Mo lv (a g ) and reg Mo ' y '(a/) < re^ Mo lv T<%). Therefore, 

we have 

reg V M W {f) = reg V M 0 W {a f ) 

= sup a (Pr (E n F)reg^ l(EnF} (a f ) + Pr {E c n F)reg V ^ {E nF) (a/)) 
(Pr,a)eP+ V J 

< sup a (Pr (E D F)reg F ^ l(EnF) (a g ) + Pr (E c n F)reg 1 ^ l(E nF) (a<?)) 
(Pr,a)eP+ V J 

= reg V M W {g), 

which means that / £ C r ^ F (AT), as required. 

Next, consider an act g £ M’ such that g ^ C^ 9 ' FnF (M'). This means that reg F d ^ ErF \a,f) < 

re 9 P MQ ( ' mF) i a g) and re 5Mo l(E nF) ( a /) ^ ^Mo 1 ^ nF) K)' Let («Pr*> Pr *) e C(V + ) be such 
that 

a Pr *(Pr*(£ n F)regH o l{EnF \a g ) + Pr*(£ c n F)reg^J {ECnF \a g ) 

= sup (Pr!a)eP+ a (Vr (E n F)reg^J {EnF \a g ) + Pr {E c n F)reg F ^ {E nF) (a g )) . 

Such a pair (a Pr *, Pr*) exists, since we have assumed that C(V + ) is closed. If a Pr »Pr*(£'nF) = 
0, then reg^ Io ^ F (a g ) = sup( Pr Q ) e p+ a (Pr(E c n F)reg J ^ r J^ E nF \a g ) S j. By separability, it must 


be the case that reg F I J ( ' EnF \a g ) = 0, contradicting our assumption that 0 < reg f Mo ^ ya ' ' r ’(aj) < 
r eg F I J( ErF ' > (dg). Therefore, it must be that a Pr » Pr*(T n F) > 0, and 

reg V M W (f) = reg F l lF {a f ) 

= sup a (Pr(E D F)reg^ EnF \a f ) + Pt{E c n F)reg^ l{E nF) (a/)) 
(Pr,a)eP+ V ' 

< sup a (Pr (E D F)reg E F l(EnF} (a g ) + Pr (E c n F)reg^ ECnF) {a g )) 

(Pr ,a)£V+ V J 


V+\(ECF). 


reg F p F {g), 


which means that g ^ C r F 9 ' F (AT). 


□ 
C Proof of Theorem 4.5


To prove Theorem 4.5, we need the following lemma. 


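Lemma C.1. For all utility functions $u$, sets $\mathcal{P}^+$ of weighted probabilities, acts $f$, and menus
$M$ containing $f$, $\mathrm{reg}_M^{\mathcal{P}^+}(f) = \mathrm{reg}_M^{C(\mathcal{P}^+)}(f)$.

Proof. Simply observe that

$$\begin{aligned}
\mathrm{reg}_M^{\mathcal{P}^+}(f) &= \sup_{(\Pr,\alpha) \in \mathcal{P}^+} \alpha \left( \sum_{s \in S} \Pr(s)\, \mathrm{reg}_M(f, s) \right) \\
&= \sup_{(\Pr,\alpha) \in \mathcal{P}^+} \left( \sum_{s \in S} \alpha \Pr(s)\, \mathrm{reg}_M(f, s) \right) \\
&= \sup_{\{p \,:\, p \leq \alpha \Pr,\ (\Pr,\alpha) \in \mathcal{P}^+\}} \left( \sum_{s \in S} p(s)\, \mathrm{reg}_M(f, s) \right) \\
&= \mathrm{reg}_M^{C(\mathcal{P}^+)}(f),
\end{aligned}$$

by definition. □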


The next lemma uses an argument almost identical to one used in Lemma 7 of [Halpern
and Leung 2012].

Lemma C.2. If $C(\mathcal{P}^+|^\chi F)$ is convex and $q$ is a subprobability on $F$ not in $C(\mathcal{P}^+|^\chi F)$, then
there exists a non-negative vector $\theta$ such that for all $(\Pr, \alpha) \in \mathcal{P}^+|^\chi F$, we have

$$\sum_{s \in F} \alpha \Pr(s)\theta(s) < \sum_{s \in F} q(s)\theta(s).$$


Proof. Given a set $\mathcal{P}^+$ of weighted probabilities, let $C'(\mathcal{P}^+) = \{p : p \in \mathbb{R}^{|S|}$ and $p \leq \alpha \Pr$ for some
$(\Pr, \alpha) \in \mathcal{P}^+\}$. Note that an element $q \in C'(\mathcal{P}^+)$ may not be a subprobability measure, since we do not
require that $q(s) \geq 0$. Since $C'(\mathcal{P}^+|^\chi F)$ and $\{q\}$ are closed, convex, and disjoint, and $\{q\}$ is
compact, the separating hyperplane theorem [Rockafellar 1970] says that there exist $\theta \in \mathbb{R}^{|S|}$
and $c \in \mathbb{R}$ such that

$$\theta \cdot p < c \text{ for all } p \in C'(\mathcal{P}^+|^\chi F), \text{ and } \theta \cdot q > c. \quad (4)$$

Since $\{\alpha \Pr : (\Pr, \alpha) \in \mathcal{P}^+|^\chi F\} \subseteq C'(\mathcal{P}^+|^\chi F)$, we have that for all $(\Pr, \alpha) \in \mathcal{P}^+|^\chi F$,

$$\sum_{s \in F} \alpha \Pr(s)\theta(s) < \sum_{s \in F} q(s)\theta(s).$$

Now we argue that it must be the case that $\theta(s) \geq 0$ for all $s \in F$. Suppose that $\theta(s') < 0$ for
some $s' \in F$. Define $p^*$ by setting

$$p^*(s) = \begin{cases} 0, & \text{if } s \neq s' \\ -\dfrac{|c|}{|\theta(s')|}, & \text{if } s = s'. \end{cases}$$

Note that $p^* \leq 0$, since for all $s \in S$, $p^*(s) \leq 0$. Therefore, $p^* \in C'(\mathcal{P}^+|^\chi F)$.

Our definition of $p^*$ also ensures that $\theta \cdot p^* = \sum_{s \in S} p^*(s)\theta(s) = p^*(s')\theta(s') = |c| \geq c$. This
contradicts (4), which says that $\theta \cdot p < c$ for all $p \in C'(\mathcal{P}^+|^\chi F)$. Thus it must be the case that
$\theta(s) \geq 0$ for all $s \in S$. □


We are now ready to prove Theorem 4.5, which we restate here.

Theorem 4.5. If $C(\mathcal{P}^+)$ is closed and convex, then Axiom 1 holds for the family of choices
$C^{\mathrm{reg},\mathcal{P}^+|^\chi E}_{M}$ if and only if $\mathcal{P}^+$ is $\chi$-rectangular.


We prove the two directions of implication in the theorem separately. Note that the proof
that $\chi$-rectangularity implies Axiom 1 does not require $C(\mathcal{P}^+)$ to be convex.

Claim C.3. If $\mathcal{P}^+$ is $\chi$-rectangular, then Axiom 1 holds for the family of choices $C^{\mathrm{reg},\mathcal{P}^+|^\chi E}_{M}$.


Proof. By Theorem 4.3 it suffices to show that SEP holds. For the first part of SEP, we must 
show that 


V+\ x F, f s 

reg M (/) = sup 

(Pr ,a)£V+\XF 


a 


(Pr(£ n F)reg^ lX{EnE \f) + Pr (E c n F)reg V ^ X{E nF) 


(5) 


Unwinding the definitions, © is equivalent to 

re 9 V M^ F {f) 

= su P(Pr 3 ,a 3 )eP+|xF «Pr 3 ( p r 3 (E n F) sup (Pri)ai)e7 ,+ | XF a\ EnF EseEnF Pr i( s l ( E n F )) re 9M(f > s)) 
+ Pr 3 (E c n F) sup (Pr2>a2)eF+ | XF 

a 2,E c nF ^2s£E c nF Pr 2 (s\(E c nF))reg M (f,s))y 


The sups in this expression are taken on by some (Pr*,a*), (Pr^a^), (Pr^a^) E V + \ X F. By 
X-rectangularity, we have that for all (Pri,aq), (Pr 2 ,a 2 ), ( p r 3 ,a 3 ) E V + \ X F, 

a Pr3 Pr 3 (E n F)al EnF Pn\(E n F) + a Pr3 Pr 3 (£ c n F)a* EcnF Pr 2 \{E c n F) E C(V+\ X F). (6) 

Thus, for all e > 0, 


™C XF (/) 


[by Lemma C.l 


> 


(?4(E n F)(al EnF ) x Escehf Pr!(s |(E n F))reg M (f , s)) 

+ Pr 1{E C n F)(al EcnF ) x EseE^nF ^1 (£ c D F))reg M {f , s)j) 


~e [by 
Therefore


re 9 V M^ F U) 

> «3 (P4(i? n F)(a* EnF )x E seEn F P^(sP □ F))re 3M (/, S )) 

+ Pr * 3 (E C n F)(a* EcnF )x £ sgFcnF P^(*l {E c D F))reg M (f, s)j) 

= su P(Pr 3 ,a 3 )eP+|xF «3 (Pr 3 (^ 0 F) sup {Pruai)eV+lxF ot* EnF Esgehf P r i(s|( s n F))reg M (f, s)) 

+ Pt 3 (E c n F) sup (Pr2ia2)eP+ | XF a\ E c PF Es&E-nF Pr 20l (E c n F))reg M (f , s))) 

[by the choice of (Pr*, a*), i = 1, 2, 3] 

= sup (Pr Q)gF+ | XF a (P r(E n F)reg F p X(EnF \f) + Pr {E c n F)reg V ^ X{E nF) (/)) , 


as required. 

It remains to show the opposite inequality in ([5]), namely, that 
re 9 1 M ]XF {f) < sup a (Pr(£? n F)reg^^ Er[F \f) + Pr(£ c n F)reg V ^ HE nF) 

(Pr, a)er+\xF v 


It suffices to note that the right-hand side is equal to 

sup( Pria)eF +| XF (a Pr (E n F) sup (PriiQl)gF +| XF a^ FnF Esgehf Pr i( s l^ n F )reg M (f , s)) 

+a Pr(£ c n F) sup (Pr2 Q2)eF+ | XF a x EcnF EseE-nF Pr 2( s l^ c n F)reg M {f, s))) 

> E v +\ XF {reg M (f)) [by rectangularity] 

= reg^ XF (f). 

This completes the proof that ([5]) holds. 

For the second part of SEP, suppose that V + (E n F) > 0 and reg V M l x ( EnF )(j) ^ o. 

If reg V M nF \f) = 0 then, since V + (E n F) > 0, we have that reg F f P F (f ) > 0 = 

sup( PrQ ) gF+ | XF aPr(E c P\F)reg^ /[ ^ ( ' F rF \f), as desired. Otherwise, by part (b) of %-rectangularity, 
for all 6 > 0, there exists (Pr, a) G V + \ X F such that a(5Pr(E'nF)-|-Pr(E c nF)) > sup( Pr / a /) gF >+ a ’ Pr 7 (.E c n 
F). Therefore, using the first part of SEP, we have 


r^; P,EnF) (f) 


su P(p r ,a)6P+|xira (Pr(E n F)reg^ X( ' EnF \f) + Pr(E c n F)reg’^\ X( ' E nF) 
reg F 4 ]X(E nF " > (/) sup (Pra)g -p+| XF a ^Pr (finFj^J + PrfBTF) 


re g vWn F){ f) 

p c n F' y \ 

> r eg M (/) sup( PriQ ,) gF +| XF aPr(E c n F) [by part (b) of x-rectangularity] 

= sup (Prja)gF +| xF aPr(E c 0 F)reg'^ lX{E nF) (/), 


as required. □ 

Claim C.4. If $C(\mathcal{P}^+)$ is convex and Axiom 1 holds for the family of choices $C^{\mathrm{reg},\mathcal{P}^+|^\chi E}_{M}$, then
$\mathcal{P}^+$ is $\chi$-rectangular.
Proof. Suppose that $\chi$-rectangularity does not hold. Then one of the three conditions of
rectangularity must fail.

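First suppose that it is (a); that is, for some $(\Pr_1, \alpha_1), (\Pr_2, \alpha_2), (\Pr_3, \alpha_3) \in \mathcal{P}^+$, we have
$\Pr_1(E \cap F) > 0$ and $\Pr_2(E^c \cap F) > 0$ and

$$\alpha_3 \Pr_3(E \cap F)\, \alpha^{\chi}_{1,E \cap F} \Pr_1|(E \cap F) + \alpha_3 \Pr_3(E^c \cap F)\, \alpha^{\chi}_{2,E^c \cap F} \Pr_2|(E^c \cap F) \notin C(\mathcal{P}^+|^\chi F).$$

Let $p^* = \alpha_3 \Pr_3(E \cap F)\, \alpha^{\chi}_{1,E \cap F} \Pr_1|(E \cap F) + \alpha_3 \Pr_3(E^c \cap F)\, \alpha^{\chi}_{2,E^c \cap F} \Pr_2|(E^c \cap F)$. Since we have
assumed that $C(\mathcal{P}^+)$ is convex, we have that $C(\mathcal{P}^+|^\chi F)$ is also convex. By Lemma C.2, there
exists a non-negative vector $\theta$ such that for all $\alpha \Pr \in C(\mathcal{P}^+|^\chi F)$, we have

$$\sum_{s \in F} \alpha \Pr(s)\theta(s) < \sum_{s \in F} p^*(s)\theta(s).$$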

We construct a decision problem $D$ based on $(S, \Sigma)$. $D$ has two stages: in the first stage,
nature chooses a state $s \in S$, but only states in $F \subseteq S$ are chosen with positive probability,
so when the DM plays, his beliefs are characterized by $\mathcal{P}^+|^{\chi}F$. In the second stage, the DM
chooses an action from the set $M = \{f, g\}$, with utilities defined as follows:

$$
\begin{aligned}
u(f,s) &= -\theta(s), \text{ and} \\
u(g,s) &= 0 \text{ for all } s.
\end{aligned}
$$

The act $f$ will have regret precisely $\theta(s)$ in state $s \in S$. By Lemma C.2,


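$$
\begin{aligned}
&\sup_{(\Pr,\alpha)\in\mathcal{P}^+|^{\chi}F} \alpha\Big(\Pr(E\cap F)\,\mathrm{reg}_M^{\mathcal{P}^+|^{\chi}(E\cap F)}(f) + \Pr(E^c\cap F)\,\mathrm{reg}_M^{\mathcal{P}^+|^{\chi}(E^c\cap F)}(f)\Big) \\
&\quad\ge \alpha_3\Big(\Pr_3(E\cap F)\,\mathrm{reg}_M^{\alpha^{\chi}_{1,E\cap F}\Pr_1|(E\cap F)}(f) + \Pr_3(E^c\cap F)\,\mathrm{reg}_M^{\alpha^{\chi}_{2,E^c\cap F}\Pr_2|(E^c\cap F)}(f)\Big) \\
&\quad= \sum_{s\in F}p^*(s)\theta(s) \\
&\quad> \sup_{(\Pr,\alpha)\in\mathcal{P}^+|^{\chi}F} \mathrm{reg}_M^{\alpha\Pr}(f) = \mathrm{reg}_M^{\mathcal{P}^+|^{\chi}F}(f),
\end{aligned}
$$

violating SEP. By Theorem 4.3, Axiom 1 cannot hold.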
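
The two-act construction used in this case can be sanity-checked numerically. The following is a minimal sketch, not part of the original proof: the three states, the values of theta, and the two weighted measures are made-up illustrative data, and the helper names `regret` and `weighted_regret` are hypothetical. It assumes that the weighted expected regret of an act is the supremum over pairs (Pr, alpha) of alpha times the expected regret, and it checks that f has regret theta(s) in every state while g has regret 0.

```python
from typing import Dict, List, Tuple

Measure = Dict[str, float]
Weighted = Tuple[Measure, float]  # a pair (Pr, alpha)

def regret(utils: Dict[str, Dict[str, float]], act: str, s: str) -> float:
    """Regret of `act` in state s: best achievable utility in s minus its own."""
    best = max(u[s] for u in utils.values())
    return best - utils[act][s]

def weighted_regret(utils: Dict[str, Dict[str, float]], act: str,
                    weighted_set: List[Weighted]) -> float:
    """Supremum over (Pr, alpha) of alpha * sum_s Pr(s) * regret(act, s)."""
    return max(
        alpha * sum(p * regret(utils, act, s) for s, p in pr.items())
        for pr, alpha in weighted_set
    )

# Hypothetical nonnegative vector theta and menu M = {f, g}:
# u(f, s) = -theta(s) and u(g, s) = 0 for all s.
theta = {"s1": 2.0, "s2": 0.5, "s3": 1.0}
utils = {
    "f": {s: -t for s, t in theta.items()},
    "g": {s: 0.0 for s in theta},
}

# Hypothetical finite set of weighted probability measures.
weighted_set: List[Weighted] = [
    ({"s1": 0.5, "s2": 0.5, "s3": 0.0}, 1.0),
    ({"s1": 0.0, "s2": 0.5, "s3": 0.5}, 0.8),
]

per_state_f = {s: regret(utils, "f", s) for s in theta}
per_state_g = {s: regret(utils, "g", s) for s in theta}
total_f = weighted_regret(utils, "f", weighted_set)
```

With these utilities the weighted regret of f reduces to the supremum of alpha times the sum of Pr(s) theta(s), which is the quantity that Lemma C.2 compares with the sum of p*(s) theta(s).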

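Now suppose that condition (b) in rectangularity does not hold. That is, for some $\delta > 0$,
for all $(\Pr,\alpha) \in \mathcal{P}^+$, $\alpha(\delta\Pr(E\cap F) + \Pr(E^c\cap F)) \le \sup_{(\Pr',\alpha')\in\mathcal{P}^+} \alpha'\Pr'(E^c\cap F)$. We construct
a decision problem $D$ based on $(S, \Sigma)$. $D$ has two stages: in the first stage, nature chooses a
state $s \in S$. In the second stage, the DM chooses an action from the set $M = \{f, g\}$, with
utilities defined as follows:

$$
\begin{aligned}
u(f,s) &= 0 \text{ for all } s \in S, \\
u(g,s) &= -\delta \text{ if } s \in E\cap F, \\
u(g,s) &= -1 \text{ if } s \notin E\cap F.
\end{aligned}
$$

Then we have that $\mathrm{reg}_M^{\mathcal{P}^+|^{\chi}(E\cap F)}(g) = \delta$ and $\mathrm{reg}_M^{\mathcal{P}^+|^{\chi}(E^c\cap F)}(g) = 1$. Using SEP and the choice
of $\delta$, we must have

$$
\begin{aligned}
\mathrm{reg}_M^{\mathcal{P}^+|^{\chi}F}(g) &= \sup_{(\Pr,\alpha)\in\mathcal{P}^+|^{\chi}F} \alpha\big(\Pr(E\cap F)\,\delta + \Pr(E^c\cap F)\big) \\
&\le \sup_{(\Pr,\alpha)\in\mathcal{P}^+|^{\chi}F} \alpha\Pr(E^c\cap F)\,\mathrm{reg}_M^{\mathcal{P}^+|^{\chi}(E^c\cap F)}(g).
\end{aligned}
$$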
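Clearly,

$$
\mathrm{reg}_M^{\mathcal{P}^+|^{\chi}F}(g) \ge \sup_{(\Pr,\alpha)\in\mathcal{P}^+|^{\chi}F} \alpha\Pr(E^c\cap F)\,\mathrm{reg}_M^{\mathcal{P}^+|^{\chi}(E^c\cap F)}(g).
$$

Thus,

$$
\mathrm{reg}_M^{\mathcal{P}^+|^{\chi}F}(g) = \sup_{(\Pr,\alpha)\in\mathcal{P}^+|^{\chi}F} \alpha\Pr(E^c\cap F)\,\mathrm{reg}_M^{\mathcal{P}^+|^{\chi}(E^c\cap F)}(g),
$$

violating the second condition of SEP. Therefore, by Theorem 4.3, Axiom 1 does not hold.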
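
The construction for condition (b) can also be illustrated with a small numerical sketch. Everything concrete here is an assumption made for illustration only: the states a, b, c, the value of delta, the two weighted measures, and the helper names are hypothetical, and all states are taken to lie in F. The sketch checks that g has regret delta on E ∩ F and 1 elsewhere, and that its weighted expected regret under a pair (Pr, alpha) equals alpha (delta Pr(E ∩ F) + Pr(E^c ∩ F)).

```python
from typing import Dict, List, Tuple

Measure = Dict[str, float]
Weighted = Tuple[Measure, float]  # a pair (Pr, alpha)

# Hypothetical example data (not from the paper); every state lies in F.
E_and_F = {"a"}        # states in E ∩ F
Ec_and_F = {"b", "c"}  # states in E^c ∩ F
delta = 0.25
states = E_and_F | Ec_and_F

# Utilities from the construction:
# u(f, s) = 0 for all s; u(g, s) = -delta on E ∩ F and -1 otherwise.
u_f = {s: 0.0 for s in states}
u_g = {s: (-delta if s in E_and_F else -1.0) for s in states}

def regret_g(s: str) -> float:
    """Regret of g in state s relative to the menu {f, g}."""
    return max(u_f[s], u_g[s]) - u_g[s]

def weighted_regret_g(pr: Measure, alpha: float) -> float:
    """alpha times the expected regret of g under Pr."""
    return alpha * sum(p * regret_g(s) for s, p in pr.items())

def closed_form(pr: Measure, alpha: float) -> float:
    """alpha * (delta * Pr(E ∩ F) + Pr(E^c ∩ F))."""
    pr_ef = sum(pr[s] for s in E_and_F)
    pr_ecf = sum(pr[s] for s in Ec_and_F)
    return alpha * (delta * pr_ef + pr_ecf)

# Hypothetical finite set of weighted probability measures.
weighted_set: List[Weighted] = [
    ({"a": 0.5, "b": 0.25, "c": 0.25}, 1.0),
    ({"a": 0.0, "b": 0.5, "c": 0.5}, 0.5),
]

sup_regret_g = max(weighted_regret_g(pr, alpha) for pr, alpha in weighted_set)
```

The sketch does not model the update rule χ or the failure of condition (b) itself; it only confirms the per-state regrets and the form of the expression that the proof bounds.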

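Finally, suppose that condition (c) in rectangularity does not hold. Then for some nonnegative
real vector $\theta \in \mathbb{R}^{|S|}$,

$$
\begin{aligned}
&\sup_{(\Pr,\alpha)\in\mathcal{P}^+|^{\chi}F}\Big(\alpha\Pr(E)\sup_{(\Pr_1,\alpha_1)\in\mathcal{P}^+|^{\chi}(E\cap F)}\sum_{s\in E\cap F}\alpha_1\Pr_1(s\mid E)\theta(s) \\
&\qquad\qquad + \alpha\Pr(E^c)\sup_{(\Pr_2,\alpha_2)\in\mathcal{P}^+|^{\chi}(E^c\cap F)}\sum_{s\in E^c\cap F}\alpha_2\Pr_2(s\mid E^c)\theta(s)\Big) \\
&\quad< \sup_{(\Pr,\alpha)\in\mathcal{P}^+|^{\chi}F}\alpha\sum_{s\in F}\Pr(s)\theta(s).
\end{aligned}
\tag{7}
$$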

We construct a decision problem $D$ based on $(S, \Sigma)$. $D$ has two stages: in the first stage, nature
chooses a state $s \in S$. In the second stage, the DM chooses an action from the set $M = \{f, g\}$,
with utilities defined as follows:

$$
\begin{aligned}
u(g,s) &= -\theta(s) \text{ for all } s \in S, \\
u(f,s) &= 0 \text{ for all } s \in S.
\end{aligned}
$$

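So we have

$$
\begin{aligned}
&\sup_{(\Pr,\alpha)\in\mathcal{P}^+|^{\chi}F} \alpha\Big(\Pr(E\cap F)\,\mathrm{reg}_M^{\mathcal{P}^+|^{\chi}(E\cap F)}(g) + \Pr(E^c\cap F)\,\mathrm{reg}_M^{\mathcal{P}^+|^{\chi}(E^c\cap F)}(g)\Big) \\
&\quad= \sup_{(\Pr,\alpha)\in\mathcal{P}^+|^{\chi}F} \alpha\Big(\Pr(E\cap F)\,E_{\mathcal{P}^+|^{\chi}(E\cap F)}(\theta) + \Pr(E^c\cap F)\,E_{\mathcal{P}^+|^{\chi}(E^c\cap F)}(\theta)\Big) \\
&\quad< E_{\mathcal{P}^+|^{\chi}F}(\theta) \quad \text{[by (7)]} \\
&\quad= \mathrm{reg}_M^{\mathcal{P}^+|^{\chi}F}(g).
\end{aligned}
$$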

This means that SEP, and hence Axiom 1, is violated, a contradiction. □ 